Goto

Collaborating Authors

 spoken language


Unsupervised Learning of Spoken Language with Visual Context

Neural Information Processing Systems

Humans learn to speak before they can read or write, so why can't computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data comprised of over 120,000 spoken audio captions for the Places image dataset and evaluate our model on an image search and annotation task. We also provide some visualizations which suggest that our model is learning to recognize meaningful words within the caption spectrograms.


Reviews: Unsupervised Learning of Spoken Language with Visual Context

Neural Information Processing Systems

This is interesting work that is pointing into the right direction, but a few aspects of this paper are a bit problematic: 1) It would have been useful (or interesting) to use a corpus that has existing text captions, and either have users re-speak the text captions, or collect additional captions. The data collections seems generally well thought-out, but why was the Places205 data set used? Prompted speech (such as collected here) is not "spontaneous", otherwise the WSJ recognizer would not have given 20 % WER (this aspect is irrelevant for the purpose of this paper, though, I think). Typically, multiple captions are being generated for a single image. Has this been done here as well? Or is there only a single caption for each image?


The World's Top 10 Most Spoken Languages

#artificialintelligence

The Amazon growth story has been a remarkable one so far. On the top line, the company has grown every single year since its inception. Even in going back to 2004, Amazon generated a much more modest $6.9 billion in revenue compared to the massive $469 billion for 2021. Most of these sales come from their retail and ecommerce operations, which the company has come to be known for. That's because 74% of Amazon's operating profit comes from Amazon Web Services (AWS).


Ranked: The 100 Most Spoken Languages Around the World

#artificialintelligence

Even though you're reading this article in English, there's a good chance it might not be your mother tongue. Of the billion-strong English speakers in the world, only 33% consider it their native language. The popularity of a language depends greatly on utility and geographic location. Additionally, how we measure the spread of world languages can vary greatly depending on whether you look at total speakers or native speakers. Today's detailed visualization from WordTips illustrates the 100 most spoken languages in the world, the number of native speakers for each language, and the origin tree that each language has branched out from.


Unsupervised Learning of Spoken Language with Visual Context

Harwath, David, Torralba, Antonio, Glass, James

Neural Information Processing Systems

Humans learn to speak before they can read or write, so why can't computers do the same? In this paper, we present a deep neural network model capable of rudimentary spoken language acquisition using untranscribed audio training data, whose only supervision comes in the form of contextually relevant visual images. We describe the collection of our data comprised of over 120,000 spoken audio captions for the Places image dataset and evaluate our model on an image search and annotation task. We also provide some visualizations which suggest that our model is learning to recognize meaningful words within the caption spectrograms. Papers published at the Neural Information Processing Systems Conference.


Which Is The World's Most Spoken Language? Terpene. What's That, You Ask?

International Business Times

China, the most populous country in the world, has close to one billion people that speak Mandarin. Spanish is spoken by a less than half that number, primarily in Mexico, Spain and the countries in South America. English follows close behind, with Hindi in India and Arabic in the Middle East making up the top five. Or so you would think. The most common language in the world is actually not human at all.